Auto-vectorization of interleaved data for SIMD
Authors
Abstract
Similar resources
Efficient SIMD Vectorization for Hashing in OpenCL
Hashing is at the core of many efficient database operators such as hash-based joins and aggregations. Vectorization is a technique that uses Single Instruction Multiple Data (SIMD) instructions to process multiple data elements at once. Applying vectorization to hash tables results in promising speedups for build and probe operations. However, vectorization typically requires intrinsics – low-l...
SIMD Vectorization of Straight Line FFT Code
This paper presents compiler technology that targets general-purpose microprocessors augmented with SIMD execution units for exploiting data-level parallelism. FFT kernels are accelerated by automatically vectorizing blocks of straight-line code for processors featuring two-way short vector SIMD extensions like AMD’s 3DNow! and Intel’s SSE2. Additionally, a special compiler backend is introduc...
Vectorization of Multigrid Codes Using SIMD ISA Extensions
Motivated by the recent trend towards small-scale SIMD processing, we have addressed in this paper the vectorization of multigrid codes on modern microprocessors. The aim is to demonstrate that this relatively new feature can be beneficial not only for multimedia programs but also for such numerical codes. As target kernels we have considered both standard and robust multigrid algorithms, which...
Vectorization of the 2D Wavelet Lifting Transform Using SIMD Extensions
This paper addresses the vectorization of the lifting-based wavelet transform on general-purpose microprocessors in the context of JPEG2000. Since SIMD exploitation strongly depends on efficient use of the memory hierarchy, this research is based on previous work on cache-conscious DWT implementations [1,2,3]. The experimental platform on which we have chosen to study the benefits of the SIMD e...
A SIMD Approach to Thread Matching for Interleaved Multithreading
Interleaved multithreading processors offer improved performance and power efficiency in a multithreading environment compared to standard CPUs by allowing multiple threads to share a single processing pipeline. However, resource contention is a natural result of such a system and can determine how well the overall thread group performs on the processor. Selecting threads that perform well tog...
Journal
Journal title: ACM SIGPLAN Notices
Year: 2006
ISSN: 0362-1340,1558-1160
DOI: 10.1145/1133255.1133997